ABSTRACT

CAC(Q) CERs estimate the Cumulative Average Cost at total production quantity Q.
The CAC(Q) equation form is given by

$$\mathrm{CAC}(Q) = a\,Q^{b}\,f(X)$$

where Q is total quantity and f(X) is a function of physical and/or performance
characteristics.  At  first  glance  the  equation  simply  looks  like  a  learning  curve.  However, 
coefficient b will capture the learning curve effect and any other quantity-related effects, such
as degree of automation. Values for b are in the -.2 to -.4 range, which is more negative than the
-.152 exponent of the 90% learning curve typical of Solid Rocket Motors.


The logic behind the CAC(Q) equation is that the best data is the total constant year
cost and the total quantity procured. Any other data, even individual lot buys, will have
anomalies. Attempting to build CERs with lot data, or to derive theoretical first unit costs (T1s),
introduces noise into the data that masks the true relationships. This is especially true when
learning analysis on individual data points is very noisy (e.g., derived learning curves with
slopes greater than one, poor learning curve fits, etc.).

In this paper the application of the CAC(Q) technique to Solid Rocket Motors is
described. Three equations having different cost drivers were derived, all with good fit
statistics. Techniques for selecting among these three "good" equations are also described
and derivation of T1 is demonstrated.


Charles  A.  Graver 
Damon  C.  Morrison 
Tecolote  Research,  Inc. 

5290  Overpass  Road,  Bldg.  D 
Santa  Barbara,  CA  93111 
(805)  683-1813 
PREFACE


This paper contains excerpts from "Cost Estimating Solid Rocket Motors with Thrust
Vector Control", CR-06!7, Tecolote Research, Inc., February 1993. This is referred to as
Reference 3. Proprietary data has been removed so that it could be presented at the DoD
Cost Analysis Symposium. The paper has also been shortened by removing some of the
conversions contained in the original paper. The resulting paper focuses on the development
of CAC(Q) CERs, selecting between them, and deriving a T1 cost.
INTRODUCTION


1.1 PURPOSE

Tecolote  Research  has  been  investigating  solid  rocket  motor  costs  for  the  United 
States  Army  Space  and  Strategic  Defense  Command  (USASSDC)  Cost  Analysis  Office 
(CAO).  In  a  previous  study  we  developed  Cost  Estimating  Relationships  (CERs)  for  rocket 
propulsion  recurring  manufacturing  costs.  This  effort  was  documented  in  CR-0540,  October 
1991  (Ref.  1). 

The  CERs  reported  in  Ref.  1,  while  significant  in  the  statistical  sense,  left  much  to  be 
desired.  First,  the  data  set  was  very  small.  It  consisted  of  13  motors  from  eight  DoD 
procurements.  The  focus  on  recurring  manufacturing  costs  meant  that  we  had  to  use  CCDR 
data.  Only  13  data  points  could  be  found  that  had  CCDRs. 

Furthermore, the data set appeared to have two strata. Seven motors had Thrust
Vector Controls (TVC) and the remaining motors were part of missiles that had aerodynamic
controls,  most  often  actuator-driven  fins.  The  TVC  motors  tended  to  be  larger  and  performed 
strategic  missions,  while  the  aerodynamic  missile  motors  were  smaller  with  tactical  missions. 
In  addition,  the  control  costs  for  the  aerodynamic  systems  could  be  separated  from  the  motor 
costs,  as  the  fins  and  the  actuators  were  located  in  the  aft  section  of  the  missile.  On  the  other 
hand,  part  of  the  TVC  controls  are  an  integral  part  of  the  nozzle,  and  that  part  of  the  control 
cost  could  not  be  separated  from  the  motor  costs.  Hence,  the  data  set  contained  strategic 
motors  with  partial  control  costs  and  tactical  motors  without  control  costs. 

It was not surprising, therefore, that attempts to develop CERs that estimated the
theoretical first unit cost (T1) were not successful. The differences in definition between
tactical and strategic motors added to the variation inherent in many T1 CER developments.
However,  we  were  successful  in  developing  equations  that  estimated  the  cumulative  average 
costs  at  total  buy  quantity,  "Q."  These  equations  are  referred  to  as  CAC(Q)  equations. 

The  logic  behind  CAC(Q)  CERs  is  that  the  best  data  is  the  total  constant  year  cost  and 
the  total  quantity  procured.  Any  other  data,  even  individual  lot  buys,  will  have  anomalies. 
Attempting to build CERs with lot data, or with derived T1s, therefore introduces noise into the
data that masks the true relationships.

The CAC(Q) equation form is shown below. At first glance, it simply looks like a
learning curve, with f(X) representing some function of physical and performance
characteristics X that are hypothesized to explain the cost. However, the coefficient b on the
total production quantity Q will include not only the learning curve effect, but also any other
effect that is associated with quantity. The best example of another quantity-related effect is
the degree of automation. The manufacturer will automate a production line more when a
large total production quantity is expected. For these large production buys, the average cost
at total production quantity is less than it would be if automation were the same for all data
points in the data set. This, in turn, leads to a value for the coefficient b that is much less
than that for a typical learning curve. Values in the -.2 to -.4 range are common. While part
of the coefficient value represents learning (-.152 for a 90% learning curve), the rest is due to
automation or some other quantity-related cause.

$$\mathrm{CAC}(Q) = a\,Q^{b}\,f(X)$$
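The split of b into a learning part and a remainder can be checked numerically. The following is a minimal sketch, assuming the usual convention that an S% learning curve has exponent log2(S/100); the function names are illustrative and are not part of the original study.

```python
import math

def learning_exponent(slope):
    """Exponent b of a learning curve whose cost falls to `slope` per doubling."""
    return math.log(slope) / math.log(2)

def learning_share(b_total, slope=0.90):
    """Fraction of a fitted quantity exponent attributable to learning alone."""
    return learning_exponent(slope) / b_total

b90 = learning_exponent(0.90)     # about -0.152, as quoted in the text
share = learning_share(-0.30)     # share of a -0.30 exponent due to 90% learning
print(round(b90, 3), round(share, 2))
```

Under these assumptions, about half of a -0.30 exponent is attributable to 90% learning, and the share falls to roughly 38% for a -0.40 exponent; the remainder is the automation or other quantity-related effect.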

The practical consequence of CERs with this form is that you cannot directly
calculate a T1. Putting a 1 in the equation for the value of Q is not an estimate of T1. It is
only an estimate of the T1 cost if you are going to produce only one unit and never any more.
In effect, you would have no automation and a much higher cost. As a result, the reader is
cautioned as follows: DO NOT ENTER 1 FOR Q TO ESTIMATE A T1 COST.

The correct way to calculate a T1 cost is first to calculate a cumulative average cost at
total production quantity and then to convert that cost to T1 by applying a learning curve
with an appropriate slope. An example of this calculation is given below.

Suppose the CAC(Q) equation in thousands of dollars is given by the following,
where IT is total impulse in thousands of pound-seconds.

$$\mathrm{CAC}(Q) = 61.353\,Q^{-0.5298}\,(IT)^{0.6607}$$

Further, suppose that you want to estimate the T1 cost of motor X, which has a total
impulse of 700K pound-seconds, and that you are going to produce 1000 motors. Finally,
assume a 95% learning curve. Then the cumulative average cost for 1000 units is

$$\mathrm{CAC}(1000) = 61.353\,(1000)^{-0.5298}\,(700)^{0.6607} = \$119.7\mathrm{K}$$

T1 is then found by

$$\mathrm{CAC}(1000) = T_1\,(1000)^{-0.074} \quad\text{or}\quad T_1 = \$119.7\mathrm{K}/0.6 = \$199.5\mathrm{K}$$

where -0.074 is the exponent of a 95% learning curve, log2(0.95), and (1000)^(-0.074) is approximately 0.6.
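The two-step T1 calculation can be sketched as follows. This is an illustration only: it uses the 61.353 intercept that reproduces the $119.7K result, assumes the cumulative-average learning form CAC(Q) = T1 * Q^b with b = log2(slope), and the function names are hypothetical.

```python
import math

def cac_example(q, total_impulse_k):
    """Illustrative CER from the text: cost in $K, total impulse in K lb-sec."""
    return 61.353 * q ** -0.5298 * total_impulse_k ** 0.6607

def t1_from_cac(cac, q, slope):
    """Back out T1 from CAC(Q) = T1 * Q**b, with b = log2(slope)."""
    b = math.log(slope, 2)
    return cac / q ** b

cac_1000 = cac_example(1000, 700)         # close to the 119.7 quoted in the text
t1 = t1_from_cac(cac_1000, 1000, 0.95)    # close to the 199.5 quoted in the text
naive = cac_example(1, 700)               # Q = 1 entered directly into the CER
print(round(cac_1000, 1), round(t1, 1), round(naive))
```

Entering Q = 1 directly yields a value more than twenty times the T1 obtained from the two-step method, which is the reason for the caution above.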

The  CAC(Q)  equation  form  was  applied  to  the  solid  rocket  motor  data  set  in  Ref.  1 
with significantly better results than those achieved with the T1 form CERs. For example,
the Root Mean Square (RMS) error for the T1 equations ranged from 72% to 84%. For the
CAC(Q) equations the RMS error dropped to around 64%, and the fit also identified an outlier.
When this outlier was removed, the RMS error dropped to 27%.

However, the coefficient on Q appeared to be too large in magnitude. Typically, this coefficient
was less than -.5. Our concern was that the results may be spurious. We managed to tie
strategic  and  tactical  motors,  (partially)  with  and  without  control  costs,  into  the  same  data  set 
because  all  strategic  motors  had  smaller  total  production  quantities  than  the  tactical  motors. 
The  fear  was  that  the  equation  form  would  give  wrong  results  if  used  to  estimate  the  cost  of 
high  total  quantity  strategic  motors  (above  600)  or  low  total  quantity  tactical  motors  (below 
2000). 


To examine this concern, we recommended in Ref. 1 that the data be stratified and
separate equations be built for the two strata. If the value of the coefficient b is thereby
reduced in magnitude to the more practical -.2 to -.4 range, then the individual strata equations
should be used to project into this middle total production quantity range of more than 600 strategic
motors and fewer than 2000 tactical motors.

The purpose of this study was to report on the results of developing CERs for the
strategic motors with TVC. The results are significantly better than those achieved with the
Ref. 1 data set.
1.2


The first consideration in this investigation was to pick the data set. Stratifying the
Ref. 1 recurring manufacturing cost data set left only seven data points at most, and one of
those was suspect. That motor had a recurring manufacturing to recurring production cost
ratio that was completely out of the range of the other data points. Its value is 0.396, while
the average of the remaining six motors is 0.752 with a range from 0.669 to 0.804. With such
a small data set, the results would be tenuous at best.

The  data  set  size  could  be  expanded  to  seven  if  we  investigated  recurring  production 
costs  instead  of  recurring  manufacturing  costs,  as  we  could  add  the  suspect  data  point.  This 
wasn't  much  of  an  improvement  in  database  size. 

However,  we  have  recurring  production  costs  for  other  motors  from  previous  studies, 
and  we  were  able  to  find  recurring  production  costs  on  22  strategic  motors  with  TVC.  This 
wider  data  set  offered  real  potential  to  build  a  useful  and  robust  CER.  This  data  set  was 
selected  for  the  study. 

1.3  ORGANIZATION 

The  recommended  equations  are  presented  in  Section  2.  They  are  CAC(Q)  equations 
that  estimate  the  recurring  production  costs.  Three  equations  are  presented  along  with 
examples  of  their  use.  In  Section  3,  we  examine  the  three  equations  and  give  our  advice  as  to 
which  equation  to  select.  Conclusions  are  reported  in  Section  4. 
2

RECURRING PRODUCTION EQUATIONS


2.1 EQUATION FORM

The 22-point data set was used for CER development. All of the data points are
strategic motors with TVC. The costs are in thousands of FY88 dollars and include the
motor and TVC. They also include Systems Engineering and Program Management (SEPM)
costs and, hence, are typical of a subcontractor cost to the prime.

The equation form that we used is

$$\mathrm{CAC}(Q) = a\,Q^{b}\,(\mathrm{SIZE})^{c}\,(\mathrm{NUMBER\ NOZZLES})^{d}\,e^{D1}\,f^{D2}$$

where a, b, c, d, e, and f are coefficients to be estimated (e is a fitted coefficient here, not
the base of natural logarithms), Q is the total production quantity to be procured, CAC(Q) is
the cumulative average cost in thousands of FY88 dollars, and D1 and D2 are dummy variables
for material type, defined by

Material Type          D1    D2
Kevlar (Composite)      1     0
Titanium or Glass       0     1
All Other (Steel)       0     0
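The source does not spell out the fitting procedure. Assuming the coefficients were estimated by ordinary least squares after taking logarithms (which is consistent with 22 data points, six coefficients, and 16 degrees of freedom), the fit can be sketched as below. The data, the "true" coefficients, and all function names are hypothetical; none of it is data from the study.

```python
import math

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x

def fit_cac_cer(rows):
    """rows: (cac, q, size, nn, d1, d2). Returns (a, b, c, d, e, f)."""
    # ln CAC = ln a + b ln Q + c ln SIZE + d ln NN + D1 ln e + D2 ln f
    xs = [[1.0, math.log(q), math.log(s), math.log(nn), d1, d2]
          for _, q, s, nn, d1, d2 in rows]
    ys = [math.log(cac) for cac, *_ in rows]
    p = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(p)]
    ln_a, b, c, d, ln_e, ln_f = solve(xtx, xty)
    return math.exp(ln_a), b, c, d, math.exp(ln_e), math.exp(ln_f)

# Hypothetical coefficients and noise-free synthetic motors (q, size, nn, d1, d2).
true = (50.0, -0.30, 0.50, 0.60, 1.5, 1.3)
designs = [(100, 5000, 1, 0, 0), (400, 20000, 1, 1, 0), (900, 3500, 4, 0, 1),
           (2000, 60000, 2, 0, 0), (250, 9000, 4, 1, 0), (1200, 100000, 1, 0, 1),
           (75, 45000, 2, 1, 0), (600, 15000, 3, 0, 0)]
rows = []
for q, s, nn, d1, d2 in designs:
    a, b, c, d, e, f = true
    rows.append((a * q**b * s**c * nn**d * e**d1 * f**d2, q, s, nn, d1, d2))
fitted = fit_cac_cer(rows)
print([round(v, 4) for v in fitted])
```

Because the dummy variables enter as exponents, the regression estimates ln e and ln f, and the multipliers reported in the equations are the exponentials of those estimates.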

The  equation  form  incorporates  Total  Quantity,  Size,  Number  of  Nozzles,  and 
Material  Type.  Three  different  size  variables  were  investigated.  These  are  Total  Weight, 
Nozzle  Weight,  and  Total  Impulse.  All  of  the  size  variables  produced  statistically  significant 
results. These are summarized below:

Sample Size                 22
Degrees of Freedom          16
RMS Error                   18.4% - 19.1%
Adjusted R2                 91.84% - 92.75%
Coefficient t Statistics    All significant
b values                    -0.29 to -0.36

As can be seen, the results are very good for each of the size variables. The problem
then becomes one of choosing between three good equations. This will be the subject of
Section 3. CERs for each of the size variables are given in Sections 2.2, 2.3, and 2.4
respectively. An example of the use of each equation is also given.

2.2 TOTAL WEIGHT EQUATION

The total weight equation is given by:

$$\mathrm{CAC}(Q) = 29.045\,Q^{-0.3387}\,TW^{0.5126}\,NN^{0.6167}\,(1.6680)^{D1}\,(1.3867)^{D2}$$

where

CAC(Q) = Cumulative Average Cost in thousands of FY88 dollars
Q      = Total Production Quantity
TW     = Total Weight in pounds
NN     = Number of Nozzles
D1     = Kevlar stratification variable
D2     = Titanium or Glass stratification variable


The significant statistics for this equation are summarized below:

Data Points           22        R-Squared (Adj)   91.84%
Degrees of Freedom    16        F Statistic       48.30
Standard Error (SE)   0.2267    RMS Error         19.1%

Coefficient Significance

Variable    Coefficient   t-Statistic   Probability Not Zero
Intercept   a                5.04       1.000
Q           b               -5.41       1.000
TW          c                9.46       1.000
NN          d                6.96       1.000
D1          e                3.21       0.995
D2          f                2.56       0.979

Data Ranges:

71 ≤ Q ≤ 2249
3200 ≤ TW ≤ 107000
1 ≤ NN ≤ 4

One data point, Motor 14, exhibits a 32.6% error and is listed as showing an unusual
value in the outlier analysis. All the other data points are estimated within 32%. Percentage
errors for each of the data points are given in Section 3.2.

As can be seen from these statistics, the total weight equation is highly significant.
Furthermore, the coefficient on Q is in the acceptable range, with a little less than half of the
quantity effect devoted to learning curve slope (-0.152 for a 90% learning curve).

As an example of using the equation, assume that we want to estimate the motor cost
of missile X. We are going to produce 1000 motors. The Total Weight is 700 pounds, with a
single nozzle weighing 30 pounds. The material to be used is Kevlar. The Total Impulse is
300 thousand pound-seconds.


The cumulative average cost of 1000 motors is given by

$$\mathrm{CAC}(1000) = 29.045\,(1000)^{-0.3387}\,(700)^{0.5126}\,(1)^{0.6167}\,(1.6680)^{1}\,(1.3867)^{0}$$

which, after performing the arithmetic, equals $134K.

To calculate a T1 cost, one cannot use a Q value equal to one, but rather one must
assume a learning curve slope in conjunction with the CAC(Q) results at total production
quantity Q. Assuming 90%, we have the following:

$$\mathrm{CAC}(1000) = T_1\,(1000)^{-0.152} \quad\text{or}\quad T_1 = 134\mathrm{K}/0.35 = 383\mathrm{K}$$
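The Total Weight example can be reproduced with a short sketch. Two assumptions should be noted: the quantity exponent is taken as -0.3387, a reading that is consistent with the $134K result and with the reported -0.29 to -0.36 range of b values, and T1 is backed out with the cumulative-average learning form. The function and argument names are illustrative only.

```python
import math

def cac_total_weight(q, tw_lb, nn, kevlar=False, titanium_or_glass=False):
    """Total Weight CER: cumulative average cost in FY88 $K at quantity q."""
    d1 = 1 if kevlar else 0
    d2 = 1 if titanium_or_glass else 0
    # The -0.3387 quantity exponent is an assumed reading (see the lead-in).
    return (29.045 * q ** -0.3387 * tw_lb ** 0.5126 * nn ** 0.6167
            * 1.6680 ** d1 * 1.3867 ** d2)

def t1_from_cac(cac, q, slope=0.90):
    """Back out T1 from CAC(Q) = T1 * Q**b, with b = log2(slope)."""
    b = math.log(slope, 2)
    return cac / q ** b

cac = cac_total_weight(1000, 700, 1, kevlar=True)
t1 = t1_from_cac(cac, 1000)
print(round(cac), round(t1))   # 134 383
```

Keeping the material type as two flags mirrors the D1 and D2 dummy variables; steel corresponds to leaving both flags off.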

2.3  NOZZLE  WEIGHT  EQUATION 

The nozzle weight equation is given by

$$\mathrm{CAC}(Q) = 97.453\,Q^{-0.2893}\,NW^{0.5929}\,NN^{0.4774}\,(1.7553)^{D1}\,(1.2601)^{D2}$$

where

CAC(Q) = Cumulative Average Cost in thousands of FY88 dollars
Q      = Total Production Quantity
NW     = Nozzle Weight in pounds
NN     = Number of Nozzles
D1     = Kevlar stratification variable
D2     = Titanium or Glass stratification variable


The significant statistics for this equation are summarized below:

Data Points           22        R-Squared (Adj)   92.75%
Degrees of Freedom    16        F Statistic       54.73
Standard Error (SE)   0.2138    RMS Error         18.9%

Coefficient Significance

Variable    Coefficient   t-Statistic   Probability Not Zero
Intercept   a                8.50       1.000
Q           b               -4.85       1.000
NW          c               10.13       1.000
NN          d                5.42       1.000
D1          e                3.77       0.998
D2          f                1.92       0.927

Data Ranges:

71 ≤ Q ≤ 2249
90 ≤ NW ≤ 1540
1 ≤ NN ≤ 4

Two data points, Motors 14 and 8, exhibit a 30.8% and 50.1% error, respectively, and
are listed as showing an unusual value in the outlier analysis. All the other data points are
estimated within 26%. Percentage errors for each of the data points are given in Section 3.2.

As can be seen from these statistics, the nozzle weight equation is highly significant.
Furthermore, the coefficient on Q is in the acceptable range, with a little more than half of the
quantity effect devoted to learning curve slope (-0.152 for a 90% learning curve).

As an example of using the equation, assume that we want to estimate the motor cost
of missile X. We are going to produce 1000 motors. The Total Weight is 700 pounds, with a
single nozzle weighing 30 pounds. The material to be used is Kevlar. The Total Impulse is
300 thousand pound-seconds.

The cumulative average cost of 1000 motors is given by:

$$\mathrm{CAC}(1000) = 97.453\,(1000)^{-0.2893}\,(30)^{0.5929}\,(1)^{0.4774}\,(1.7553)^{1}\,(1.2601)^{0}$$

which, after performing the arithmetic, equals $174K.

To calculate a T1 cost, one cannot use a Q value equal to one, but rather one must
assume a learning curve slope in conjunction with the CAC(Q) results at total production
quantity Q. Assuming 90%, we have the following:

$$\mathrm{CAC}(1000) = T_1\,(1000)^{-0.152} \quad\text{or}\quad T_1 = 174\mathrm{K}/0.35 = 497\mathrm{K}$$
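The example inputs can also be checked against the data ranges listed for this equation. The sketch below evaluates the Nozzle Weight equation and flags any input that lies outside the fitted range. The -0.2893 quantity exponent is an assumed reading that is consistent with the $174K result, and the names used are hypothetical.

```python
DATA_RANGES = {"q": (71, 2249), "nw_lb": (90, 1540), "nn": (1, 4)}

def cac_nozzle_weight(q, nw_lb, nn, kevlar=False, titanium_or_glass=False):
    """Nozzle Weight CER: cumulative average cost in FY88 $K at quantity q."""
    d1 = 1 if kevlar else 0
    d2 = 1 if titanium_or_glass else 0
    # The -0.2893 quantity exponent is an assumed reading (see the lead-in).
    return (97.453 * q ** -0.2893 * nw_lb ** 0.5929 * nn ** 0.4774
            * 1.7553 ** d1 * 1.2601 ** d2)

def out_of_range(**inputs):
    """Names of inputs that fall outside the fitted data ranges."""
    return [name for name, value in inputs.items()
            if not DATA_RANGES[name][0] <= value <= DATA_RANGES[name][1]]

cac = cac_nozzle_weight(1000, 30, 1, kevlar=True)
flags = out_of_range(q=1000, nw_lb=30, nn=1)
print(round(cac), flags)   # 174 ['nw_lb']
```

The 30-pound nozzle is below the smallest nozzle in the data set, so this estimate is an extrapolation of the kind examined in Section 3.3.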

2.4  TOTAL  IMPULSE  EQUATION 

The  total  impulse  equation  is  given  by 

$$\mathrm{CAC}(Q) = 77.595\,Q^{-0.3597}\,TI^{0.5081}\,NN^{d}\,(1.4433)^{D1}\,(1.2939)^{D2}$$

where

CAC(Q) = Cumulative Average Cost in thousands of FY88 dollars
Q      = Total Production Quantity
TI     = Total Impulse in thousands of pound-seconds
NN     = Number of Nozzles
d      = Coefficient on Number of Nozzles (value not legible in this copy)
D1     = Kevlar stratification variable
D2     = Titanium or Glass stratification variable

The significant statistics for this equation are summarized below:

Data Points           22        R-Squared (Adj)   92.65%
Degrees of Freedom    16        F Statistic       53.94
Standard Error (SE)   0.2153    RMS Error         18.4%

Coefficient Significance

Variable    Coefficient   t-Statistic   Probability Not Zero
Intercept   a                7.79       1.000
Q           b               -6.07       1.000
TI          c               10.05       1.000
NN          d                7.26       1.000
D1          e                2.36       0.969
D2          f                2.13       0.951

Data Ranges:

71 ≤ Q ≤ 2249
600 ≤ TI ≤ 27000
1 ≤ NN ≤ 4

No  data  points  show  an  unusual  value  in  the  outlier  analysis.  All  the  data  points  are 
estimated  within  32.52%.  Percentage  errors  for  each  of  the  data  points  are  given  in  Section 
3.2. 


As  can  be  seen  from  these  statistics,  the  total  impulse  equation  is  highly  significant. 
Furthermore,  the  coefficient  on  Q  is  in  the  acceptable  range,  with  a  little  less  than  half  of  the 
quantity  effect  devoted  to  learning  curve  slope  (-0.152  for  a  90%  learning  curve). 

As  an  example  of  using  the  equation,  assume  that  we  want  to  estimate  the  motor  cost 
of  missile  X.  We  are  going  to  produce  1000  motors.  The  Total  Weight  is  700  pounds,  with  a 
single  nozzle  weighing  30  pounds.  The  material  to  be  used  is  Kevlar.  The  Total  Impulse  is 
300  thousand  pound-seconds. 

The cumulative average cost of 1000 motors is given by:

$$\mathrm{CAC}(1000) = 77.595\,(1000)^{-0.3597}\,(300)^{0.5081}\,(1)^{d}\,(1.4433)^{1}\,(1.2939)^{0}$$

which, after performing the arithmetic, equals $132K.

To calculate a T1 cost, one cannot use a Q value equal to one, but rather one must
assume a learning curve slope in conjunction with the CAC(Q) results at total production
quantity Q. Assuming 90%, we have the following:

$$\mathrm{CAC}(1000) = T_1\,(1000)^{-0.152} \quad\text{or}\quad T_1 = 132\mathrm{K}/0.35 = 377\mathrm{K}$$
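As a cross-check, the Total Impulse example can be evaluated directly. The sketch below is restricted to the single-nozzle case (NN = 1, so the nozzle term equals 1 whatever its exponent) and uses the coefficients as they appear in this text; the function name is hypothetical. With those coefficients the arithmetic gives roughly $169K rather than the $132K quoted, while a quantity exponent near -0.396 would reproduce $132K. One of the printed values may therefore be in error, and the text alone does not settle which.

```python
import math

def cac_total_impulse_single_nozzle(q, ti_k, kevlar=False, titanium_or_glass=False):
    """Total Impulse CER for a one-nozzle motor (NN = 1, so the NN term is 1)."""
    d1 = 1 if kevlar else 0
    d2 = 1 if titanium_or_glass else 0
    return 77.595 * q ** -0.3597 * ti_k ** 0.5081 * 1.4433 ** d1 * 1.2939 ** d2

cac = cac_total_impulse_single_nozzle(1000, 300, kevlar=True)

# Quantity exponent that would reproduce the quoted 132 ($K) at Q = 1000.
b_implied = -0.3597 + math.log(132 / cac) / math.log(1000)
print(round(cac, 1), round(b_implied, 3))
```

Whichever value is correct, the T1 conversion is unaffected in form: divide the cumulative average cost at Q = 1000 by (1000)^(-0.152), which is about 0.35.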


3 

SELECTING  A  CER 


Three  very  good  equations  were  presented  in  Section  2.  The  only  difference  in  form 
is  the  size  variable.  How  does  one  choose  a  CER  from  the  equations  based  on  Total  Weight, 
Nozzle  Weight,  or  Total  Impulse?  In  this  section,  we  address  this  question  by  examining  the 
traditional  statistics  (3.1),  comparing  the  fit  for  individual  data  points  (3.2),  and  seeing  how 
well  the  equation  estimates  smaller  motors  (3.3). 

3.1  TRADITIONAL  STATISTICS 

A  number  of  statistical  measures  are  presented  for  each  equation  in  Sections  2.2,  2.3, 
and 2.4. Information on how well the equations fit, and hence can predict, is summarized in
these statistics. We have selected four for comparison. These are shown below.

STATISTICS

                      Equation Based On
                      Total Weight   Nozzle Weight   Total Impulse
R2 (ADJ)              91.84%         92.75%          92.65%
Standard Error        0.2267         0.2138          0.2153
RMS Error             19.1%          18.9%           18.4%
Number of Outliers    1              2               0

From  these  statistics,  it  appears  that  the  Total  Impulse  equation  is  marginally  better. 
Most  significantly,  it  has  no  outliers.  Its  RMS  Error  is  best,  and  it  has  the  second-best 
standard  error  and  Adjusted  R2. 

However,  from  a  statistical  point  of  view,  there  is  really  not  much  difference  between 
the  three  equations.  The  choice,  therefore,  may  depend  most  on  the  information  available  to 
the  cost  estimator.  Are  estimates  of  all  three  size  variables  available,  and  what  confidence  is 
there in their values? For example, Nozzle Weight is often not available as early as the other
two.
3.2 ANALOGY

Another means of choosing is by analogy. In this case, one selects a motor that is
most like the one being estimated. The equation that has the smallest percent error for the
selected motor is the one that is preferred.


Percent  errors  for  all  the  motors  in  the  database  are  given  in  the  table  below. 
Separate  entries  have  been  made  for  each  of  the  three  equations.  A  positive  entry  in  the  table 
means  that  the  equation  estimated  high.  A  negative  entry  means  that  the  equation  estimated 
low.  Thus,  for  example,  Motor  1  is  estimated  low  by  8.9%  using  the  total  weight  equation. 


PERCENT ERROR

             Equation Based On
Motor        Total Weight   Nozzle Weight   Total Impulse
Motor 1          -8.90         -16.46           -8.73
Motor 2         -13.55          -5.37           -7.98
Motor 3          -2.77          -6.43           -2.85
Motor 4         -15.48         -11.19          -16.55
Motor 5         -16.07         -11.38          -10.91
Motor 6          27.87          24.50           24.42
Motor 7          24.92          21.52           23.50
Motor 8           1.97          50.13           -7.40
Motor 9          27.00         -14.88           32.52
Motor 10         12.21           1.96           13.95
Motor 11        -26.44         -25.19          -23.59
Motor 12          1.93          -9.10            1.12
Motor 13        -17.23         -10.42          -14.56
Motor 14        -32.60         -30.78          -29.01
Motor 15         13.85           9.51           11.06
Motor 16         22.50          25.99           20.54
Motor 17         -6.78           1.09           -6.61
Motor 18          8.57          14.64            4.92
Motor 19         31.06          11.43           32.07
Motor 20        -24.23          -8.81          -27.54
Motor 21         18.28           3.48           16.35
Motor 22         14.36          22.94           12.10
If the booster that the analyst needs to estimate is most similar to Motor 21, then
the equation on Nozzle Weight seems to be best. It overestimated the cost by 3.48%.
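The analogy rule amounts to picking the equation with the smallest absolute percent error for the chosen motor. A minimal sketch, using three rows copied from the percent error table; the dictionary layout and function name are illustrative assumptions.

```python
# Percent errors for three motors, copied from the table above.
PERCENT_ERROR = {
    "Motor 1":  {"Total Weight": -8.90, "Nozzle Weight": -16.46, "Total Impulse": -8.73},
    "Motor 8":  {"Total Weight": 1.97,  "Nozzle Weight": 50.13,  "Total Impulse": -7.40},
    "Motor 21": {"Total Weight": 18.28, "Nozzle Weight": 3.48,   "Total Impulse": 16.35},
}

def best_equation(motor):
    """Equation with the smallest absolute percent error for the analog motor."""
    errors = PERCENT_ERROR[motor]
    return min(errors, key=lambda name: abs(errors[name]))

print(best_equation("Motor 21"))   # Nozzle Weight
```

Using the absolute value matters: a motor estimated 8% low is as good an analog fit as one estimated 8% high.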

3.3 EXTRAPOLATION ESTIMATING

A third means of choosing between equations arises when the motor to be estimated
falls outside or nearly outside the range of the data. This is true for most of the motors being
considered for TMD and NMD applications. These motors tend to be at the low end of the
database, i.e., they are smaller than most or all of the motors in the data set. The relevant
question, then, is how well does the equation estimate when extrapolating to smaller motors.

To  test  this,  we  ordered  the  motors  in  the  database  by  the  size  variable.  We  then 
dropped  the  smallest  motor  from  the  database  and  refit  the  equation  with  the  remaining  21 
data points. We then predicted the cost of the smallest motor. In this case, Motor 20 was
smallest  for  Total  Weight  and  Total  Impulse.  Motor  19  was  smallest  for  Nozzle  Weight. 
The  Total  Weight  equation  predicted  Motor  20  cost  low  by  34.6%,  while  the  Total  Impulse 
equation  predicted  low  by  39.4%.  The  Nozzle  Weight  equation  predicted  the  Motor  19  cost 
high  by  17.7%. 

Note that we refer to these calculations as predictions rather than estimations. This is
to  denote  that  there  is  a  prediction  being  made,  as  the  data  point  in  question  is  not  in  the 
database.  This  is  different  from  the  percent  error  calculations  made  in  regression  analysis, 
where  the  estimate  is  in  reality  a  measurement  of  how  well  the  equation  fit  the  data  point,  as 
the  data  point  was  part  of  the  database.  In  this  sense,  predictions  made  from  the  extrapolation 
estimating  technique  are  a  real  estimate  of  the  error  that  one  would  encounter. 

We repeated the process described above until the sample size remaining reached 11,
which still allowed 5 degrees of freedom. At each step, the smallest remaining data point
was dropped, the equation refit, and the most recently dropped data point predicted.
Although predictions of all the dropped data points could be used, concentrating on the
one-step predictions has some statistical advantages. For example, it can be shown under the
normal regression theory assumptions that the one-step predictions have statistical properties
similar to the database residuals in regression theory. For more about this technique and the
statistical properties, see Ref. 2.
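The drop, refit, and predict loop can be sketched as follows. To keep the example short it fits a single-variable power law, cost = a * size^c, rather than the full six-coefficient CER, and it runs on hypothetical (size, cost) pairs that are not data from the study. The function names are illustrative.

```python
import math

def fit_power_law(points):
    """Least-squares fit of ln(cost) = ln(a) + c*ln(size); returns (a, c)."""
    xs = [math.log(s) for s, _ in points]
    ys = [math.log(y) for _, y in points]
    n = len(points)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    c = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
    return math.exp(ybar - c * xbar), c

def one_step_predictions(points, min_sample=11):
    """Drop the smallest remaining point, refit, and predict the dropped point."""
    remaining = sorted(points)                 # ascending by size
    results = []
    while len(remaining) > min_sample:
        size, actual = remaining.pop(0)        # smallest remaining motor
        a, c = fit_power_law(remaining)        # refit without it
        predicted = a * size ** c
        # Positive percent error means the equation predicted high.
        results.append((len(remaining), 100.0 * (predicted - actual) / actual))
    return results

# Hypothetical (size, cost) pairs with alternating +/-10% scatter.
points = [(100.0 * 1.3 ** i, 40.0 * (100.0 * 1.3 ** i) ** 0.5 * (1.1 if i % 2 else 0.9))
          for i in range(22)]
results = one_step_predictions(points)
print([n for n, _ in results])   # sample sizes 21 down to 11
```

Starting from 22 points and stopping at a sample size of 11 yields eleven one-step predictions, each made from an equation that never saw the point being predicted.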


Results  of  this  analysis  for  the  11  smallest  data  points  are  summarized  in  the 
following  table.  The  percentage  error  entries  in  the  table  represent  overestimates  if  positive, 
and  underestimates  if  negative.  The  number  in  parentheses  is  the  sample  size  from  which  the 
estimate  was  made. 


PERCENT ERROR IN PREDICTION
(Sample Size Used in Prediction)

                    Equation Based On
Motor               Total Weight   Nozzle Weight   Total Impulse
Motor 1             *              *               *
Motor 2             *              *               -13.5 (11)
Motor 3                            *               *
Motor 4             -23.9 (19)     -11.1 (16)      -22.1 (19)
Motor 5             -22.6 (15)     -3.1 (13)       -17.9 (14)
Motor 6             *              *               *
Motor 7             31.1 (14)      35.6 (12)       29.3 (15)
Motor 8             *              *               *
Motor 9             47.6 (12)      -21.6 (15)      53.6 (12)
Motor 10            *              *               *
Motor 11            *              *               *
Motor 12            *              *               *
Motor 13            *              *               *
Motor 14                           *               *
Motor 15            *              *               *
Motor 16            *              68.4 (11)       *
Motor 17            -15.1 (20)     3.2 (20)        -18.9 (17)
Motor 18            7.8 (18)       51.3 (14)       -4.1 (20)
Motor 19            36.0 (17)      17.7 (21)       30.5 (18)
Motor 20            -34.6 (21)     -6.9 (19)       -39.4 (21)
Motor 21            31.8 (13)      22.3 (17)       19.3 (13)
Motor 22            14.6 (16)      34.9 (18)       5.5 (16)

Sum                 83.5           190.7           22.3
Average             25.5           25.1            23.1
Weighted Average    24.7           22.8            22.8

* No prediction made (motor not among the 11 smallest for that size variable).

The sum of the percent errors is an indication of bias. In regression theory, the sum of
the residuals is always zero. In extrapolation estimating, this is not the case because the data
point being predicted is never in the database when the regression is performed. The closer
the sum of the percent errors is to zero, the less bias. In this case, all three equations tend to
overpredict, but the Total Impulse equation shows the least bias. The average percent error is
calculated on the absolute percent errors. Here all three equations are similar, with Total
Impulse performing best. A weighted average is also calculated. Sample size is used for the
weight, thus giving greater weight to predictions from the larger databases. Again, all three
equations perform similarly, but Total Weight is the worst.
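The three summary rows can be recomputed from the individual predictions. The sketch below does so for the Nozzle Weight and Total Impulse columns, whose eleven entries are all legible in the table; the Total Weight column is omitted because one of its entries is not. The data layout and function name are illustrative.

```python
# (percent error, sample size) for each one-step prediction, from the table above.
PREDICTIONS = {
    "Nozzle Weight": [(-11.1, 16), (-3.1, 13), (35.6, 12), (-21.6, 15), (68.4, 11),
                      (3.2, 20), (51.3, 14), (17.7, 21), (-6.9, 19), (22.3, 17),
                      (34.9, 18)],
    "Total Impulse": [(-13.5, 11), (-22.1, 19), (-17.9, 14), (29.3, 15), (53.6, 12),
                      (-18.9, 17), (-4.1, 20), (30.5, 18), (-39.4, 21), (19.3, 13),
                      (5.5, 16)],
}

def summarize(preds):
    """Return (sum, average absolute error, sample-size-weighted average)."""
    bias = sum(e for e, _ in preds)                       # signed sum: bias
    average = sum(abs(e) for e, _ in preds) / len(preds)  # mean absolute error
    weighted = sum(abs(e) * n for e, n in preds) / sum(n for _, n in preds)
    return round(bias, 1), round(average, 1), round(weighted, 1)

for name, preds in PREDICTIONS.items():
    print(name, summarize(preds))
# Nozzle Weight (190.7, 25.1, 22.8)
# Total Impulse (22.3, 23.1, 22.8)
```

Both columns reproduce the Sum, Average, and Weighted Average rows of the table, which confirms that the averages are taken over absolute errors while the sum keeps the sign.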

Using  extrapolation  estimating,  it  still  seems  that  the  Total  Impulse  equation  is  the 
best.  It  has  lower  average  error  and  less  bias. 

Another factor favoring the Total Impulse equation is the stability of the coefficient
on the size variable. The coefficient starts at 0.5081. As data is removed, the coefficient gets
as low as 0.4788 and as high as 0.5574. There is no real pattern to the variation, and the
coefficient value of the sample size 11 equation is 0.5034. Contrast this to the coefficient
behavior for the Total Weight equation. It starts at 0.5126, gets as high as 0.5721, and
finishes for sample size 11 at a low of 0.4921. Worse yet is the Nozzle Weight coefficient,
which starts at 0.5929 and gets consistently smaller until it reaches 0.3887 for sample size 11.
4

CONCLUSIONS


The investigations reported in this paper were very successful. As stated in Section 1,
the results from Ref. 1, though statistically significant, were suspicious. Our concern was
that we had tied together two separate populations by using the CAC(Q) equation form and
that using the equations to estimate large production strategic motors or small production
tactical motors would give misleading results. The solution was to build a motor CER with
TVC for the strategic motor population. We hypothesized that if the coefficient on the total
production quantity Q was increased from the -.5 range to the -.2 to -.4 range, then the CER
would be much more reasonable for cost estimating.

The  need  for  a  larger  data  set  forced  the  study  to  concentrate  on  a  22  data  point  set  of 
recurring  production  costs  instead  of  a  6  data  point  set  of  recurring  manufacturing  costs.  This 
allowed  us  to  develop  three  size-based  CERs,  all  with  significant  statistics  and  coefficients  on 
Q in the acceptable range. Furthermore, the statistics for these CERs were much better than
those in Ref. 1.

It  is  our  conclusion  that  the  stratification  of  the  data  set  into  motors  with  TVC  was  the 
reason  that  we  obtained  these  good  results.  It  is  our  recommendation  that  these  equations  be 
used, instead of those in Ref. 1, to estimate motors with TVC in general and especially when
the  total  production  quantity  is  expected  to  exceed  600. 

The three equations differ only in the size variable. There is an equation based on (1)
Total Weight, (2) Nozzle Weight, and (3) Total Impulse. Three ways of selecting the best of
these equations are given in Section 3. One of these techniques is based on how well the
equation performs when trying to estimate motors smaller than those in the data set. It was
shown that the equations have an average error of around 23 to 25 percent when predicting
smaller motors outside the data set. In general, the Total Impulse equation seems best.
However, the statistical quality of the three equations is very close, and the availability of
good input specifications for the size variable may be the most important reason for choosing
among the equations.


A final word of caution is in order when using these equations. They are CAC(Q) equations and
hence estimate the cumulative average cost at total production quantity. Do not enter one for
quantity unless you are estimating a motor with only one production unit. If you do enter one
for Q, your estimate will be high whenever more than one unit is produced and/or production
tooling has been bought. The proper method to calculate T1 costs was presented. This
shows the analyst how to correctly calculate a T1 cost from the CAC(Q) cost equations in
Section 2.